145 research outputs found

    Implicit Langevin Algorithms for Sampling From Log-concave Densities

    Full text link
    For sampling from a log-concave density, we study implicit integrators resulting from θ\theta-method discretization of the overdamped Langevin diffusion stochastic differential equation. Theoretical and algorithmic properties of the resulting sampling methods for θ∈[0,1] \theta \in [0,1] and a range of step sizes are established. Our results generalize and extend prior works in several directions. In particular, for θ≥1/2\theta\ge1/2, we prove geometric ergodicity and stability of the resulting methods for all step sizes. We show that obtaining subsequent samples amounts to solving a strongly-convex optimization problem, which is readily achievable using one of numerous existing methods. Numerical examples supporting our theoretical analysis are also presented

    Monte Carlo Estimation of the Density of the Sum of Dependent Random Variables

    Full text link
    We study an unbiased estimator for the density of a sum of random variables that are simulated from a computer model. A numerical study on examples with copula dependence is conducted where the proposed estimator performs favourably in terms of variance compared to other unbiased estimators. We provide applications and extensions to the estimation of marginal densities in Bayesian statistics and to the estimation of the density of sums of random variables under Gaussian copula dependence

    Unbiased and Consistent Nested Sampling via Sequential Monte Carlo

    Full text link
    We introduce a new class of sequential Monte Carlo methods called Nested Sampling via Sequential Monte Carlo (NS-SMC), which reframes the Nested Sampling method of Skilling (2006) in terms of sequential Monte Carlo techniques. This new framework allows convergence results to be obtained in the setting when Markov chain Monte Carlo (MCMC) is used to produce new samples. An additional benefit is that marginal likelihood estimates are unbiased. In contrast to NS, the analysis of NS-SMC does not require the (unrealistic) assumption that the simulated samples be independent. As the original NS algorithm is a special case of NS-SMC, this provides insights as to why NS seems to produce accurate estimates despite a typical violation of its assumptions. For applications of NS-SMC, we give advice on tuning MCMC kernels in an automated manner via a preliminary pilot run, and present a new method for appropriately choosing the number of MCMC repeats at each iteration. Finally, a numerical study is conducted where the performance of NS-SMC and temperature-annealed SMC is compared on several challenging and realistic problems. MATLAB code for our experiments is made available at https://github.com/LeahPrice/SMC-NS .Comment: 45 pages, some minor typographical errors fixed since last versio

    Federated Variational Inference Methods for Structured Latent Variable Models

    Full text link
    Federated learning methods enable model training across distributed data sources without data leaving their original locations and have gained increasing interest in various fields. However, existing approaches are limited, excluding many structured probabilistic models. We present a general and elegant solution based on structured variational inference, widely used in Bayesian machine learning, adapted for the federated setting. Additionally, we provide a communication-efficient variant analogous to the canonical FedAvg algorithm. The proposed algorithms' effectiveness is demonstrated, and their performance is compared with hierarchical Bayesian neural networks and topic models

    Advances in Monte Carlo methodology

    Get PDF

    Graph Neural Network-Based Anomaly Detection for River Network Systems

    Full text link
    Water is the lifeblood of river networks, and its quality plays a crucial role in sustaining both aquatic ecosystems and human societies. Real-time monitoring of water quality is increasingly reliant on in-situ sensor technology. Anomaly detection is crucial for identifying erroneous patterns in sensor data, but can be a challenging task due to the complexity and variability of the data, even under normal conditions. This paper presents a solution to the challenging task of anomaly detection for river network sensor data, which is essential for accurate and continuous monitoring. We use a graph neural network model, the recently proposed Graph Deviation Network (GDN), which employs graph attention-based forecasting to capture the complex spatio-temporal relationships between sensors. We propose an alternate anomaly scoring method, GDN+, based on the learned graph. To evaluate the model's efficacy, we introduce new benchmarking simulation experiments with highly-sophisticated dependency structures and subsequence anomalies of various types. We further examine the strengths and weaknesses of this baseline approach, GDN, in comparison to other benchmarking methods on complex real-world river network data. Findings suggest that GDN+ outperforms the baseline approach in high-dimensional data, while also providing improved interpretability. We also introduce software called gnnad

    A PAC-Bayesian Perspective on the Interpolating Information Criterion

    Full text link
    Deep learning is renowned for its theory-practice gap, whereby principled theory typically fails to provide much beneficial guidance for implementation in practice. This has been highlighted recently by the benign overfitting phenomenon: when neural networks become sufficiently large to interpolate the dataset perfectly, model performance appears to improve with increasing model size, in apparent contradiction with the well-known bias-variance tradeoff. While such phenomena have proven challenging to theoretically study for general models, the recently proposed Interpolating Information Criterion (IIC) provides a valuable theoretical framework to examine performance for overparameterized models. Using the IIC, a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence generalization performance in the interpolating regime. From the provided bound, we quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, optimizer, and parameter-initialization scheme; the spectrum of the empirical neural tangent kernel; curvature of the loss landscape; and noise present in the data.Comment: 9 page

    Continuously-Tempered PDMP samplers

    Get PDF
    New sampling algorithms based on simulating continuous-time stochastic processes called piecewise deterministic Markov processes (PDMPs) have shown considerable promise. However, these methods can struggle to sample from multi-modal or heavy-tailed distributions. We show how tempering ideas can improve the mixing of PDMPs in such cases. We introduce an extended distribution defined over the state of the posterior distribution and an inverse temperature, which interpolates between a tractable distribution when the inverse temperature is 0 and the posterior when the inverse temperature is 1. The marginal distribution of the inverse temperature is a mixture of a continuous distribution on [0,1) and a point mass at 1: which means that we obtain samples when the inverse temperature is 1, and these are draws from the posterior, but sampling algorithms will also explore distributions at lower temperatures which will improve mixing. We show how PDMPs, and particularly the Zig-Zag sampler, can be implemented to sample from such an extended distribution. The resulting algorithm is easy to implement and we show empirically that it can outperform existing PDMP-based samplers on challenging multimodal posteriors
    • …
    corecore